4 research outputs found

    ENABLING REAL-TIME CERTIFICATION OF AUTONOMOUS DRIVING APPLICATIONS

    The push towards fielding advanced driver-assist systems (ADASs) is happening at breakneck speed. Semi-autonomous features are becoming increasingly common, including adaptive cruise control and automatic lane keeping. Today, graphics processing units (GPUs) are seen as a key technology in this push towards greater autonomy. However, realizing full autonomy in mass-production vehicles will necessitate the use of stringent certification processes. Unfortunately, currently available GPUs tend to be closed-source “black boxes” whose features are not publicly disclosed; these features must be documented for certification to be tenable. Furthermore, existing real-time task models have not evolved to handle the historical-result requirements common in computer-vision (CV) applications, which introduce cycles in processing graphs; existing models must be extended to account for such dependencies. Additionally, due to size, weight, power, and cost constraints, multiple CV applications may need to share a single hardware platform; if the platform contains accelerators such as non-preemptive GPUs, such sharing must be managed in a way that ensures applications are isolated from one another. For ADAS certification to be possible, these challenges must be addressed. This dissertation addresses each of these three challenges. First, scheduling details of NVIDIA GPUs are presented, as derived through extensive micro-benchmarking experiments. These details provide the foundation for identifying and automatically detecting key issues when using NVIDIA GPUs in real-time safety-critical applications. Second, a generalization of a real-time task model is introduced, enabling the computation of response-time bounds for processing graphs that contain cycles. This model exposes a trade-off between the age of historical data, the resulting response-time bounds, and the accuracy of the CV application; this trade-off is explored in detail. Finally, a time-partitioning framework for multicore+accelerator platforms is introduced. When applied alongside existing methods for alleviating spatial interference, this framework can help enable component-wise ADAS certification on multicore+accelerator platforms.
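    To make the micro-benchmarking methodology concrete, the following is a minimal sketch of the kind of experiment that can reveal undocumented GPU scheduling behavior; the spin kernel, block size, and cycle count are illustrative assumptions, not the dissertation's actual benchmark harness. It launches identical busy-wait kernels on two CUDA streams and uses events to check whether the GPU ran them concurrently or serialized them.

        // Probe whether two kernels launched on separate streams run concurrently.
        #include <cstdio>
        #include <cuda_runtime.h>

        __global__ void spin(long long cycles) {
            long long start = clock64();
            while (clock64() - start < cycles) { /* busy-wait to occupy the GPU */ }
        }

        int main() {
            cudaStream_t s1, s2;
            cudaStreamCreate(&s1);
            cudaStreamCreate(&s2);

            cudaEvent_t begin, mid, end;
            cudaEventCreate(&begin);
            cudaEventCreate(&mid);
            cudaEventCreate(&end);

            const long long kCycles = 100000000LL;  // roughly tens of ms on most GPUs

            cudaEventRecord(begin, s1);
            spin<<<1, 32, 0, s1>>>(kCycles);  // one small block: leaves room to overlap
            cudaEventRecord(mid, s1);
            spin<<<1, 32, 0, s2>>>(kCycles);
            cudaEventRecord(end, s2);
            cudaDeviceSynchronize();

            float first_ms, total_ms;
            cudaEventElapsedTime(&first_ms, begin, mid);
            cudaEventElapsedTime(&total_ms, begin, end);

            // total ~ first       => the kernels overlapped;
            // total ~ 2 x first   => the scheduler serialized them. Either way, the
            // observed behavior is exactly what must be documented for certification.
            printf("first kernel: %.2f ms, both kernels: %.2f ms\n", first_ms, total_ms);
            return 0;
        }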

    Using Lock Servers to Scale Real-Time Locking Protocols: Chasing Ever-Increasing Core Counts

    During the past decade, parallelism-related issues have been at the forefront of real-time systems research due to the advent of multicore technologies. In the coming years, such issues will loom ever larger due to increasing core counts. Having more cores means a greater potential exists for platform capacity loss when the available parallelism cannot be fully exploited. In this paper, such capacity loss is considered in the context of real-time locking protocols. In this context, lock nesting becomes a key concern, as it can result in transitive blocking chains that unnecessarily force tasks to execute sequentially. Such chains can be quite long on a larger machine. Contention-sensitive real-time locking protocols have been proposed as a means of "breaking" transitive blocking chains, but such protocols tend to have high overhead due to more complicated lock/unlock logic. To ease such overhead, the usage of lock servers is considered herein. In particular, four specific lock-server paradigms are proposed and many nuances concerning their deployment are explored. Experiments are presented that show that, by executing cache hot, lock servers can enable reductions in lock/unlock overhead of up to 86%. Such reductions make contention-sensitive protocols a viable approach in practice.
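    As a rough illustration of the dedicated-lock-server idea (one of several conceivable paradigms; the mailbox layout and all names below are hypothetical, not the paper's implementation), the host-side C++ sketch below dedicates one thread to executing all lock/unlock logic on behalf of clients, so that the protocol's code and state stay cache-hot on the server's core rather than bouncing between client cores.

        #include <atomic>
        #include <cstdio>
        #include <thread>
        #include <vector>

        struct Request {
            std::atomic<int> op{0};      // 0 = empty, 1 = lock, 2 = unlock
            std::atomic<bool> done{false};
        };

        constexpr int kClients = 4;
        Request mailbox[kClients];                  // one request slot per client
        std::atomic<bool> shutdown_flag{false};
        bool resource_held = false;                 // protocol state: server-only

        // The server spins over the mailboxes and runs all protocol logic itself.
        void server() {
            while (!shutdown_flag.load(std::memory_order_acquire)) {
                for (int i = 0; i < kClients; ++i) {
                    int op = mailbox[i].op.load(std::memory_order_acquire);
                    if (op == 1 && !resource_held) {        // grant the lock
                        resource_held = true;
                        mailbox[i].op.store(0, std::memory_order_relaxed);
                        mailbox[i].done.store(true, std::memory_order_release);
                    } else if (op == 2) {                   // release the lock
                        resource_held = false;
                        mailbox[i].op.store(0, std::memory_order_relaxed);
                        mailbox[i].done.store(true, std::memory_order_release);
                    }
                }
            }
        }

        // Clients merely post a request and spin-wait; no protocol logic runs here.
        void client_request(int id, int op) {
            mailbox[id].done.store(false, std::memory_order_relaxed);
            mailbox[id].op.store(op, std::memory_order_release);
            while (!mailbox[id].done.load(std::memory_order_acquire)) { /* spin */ }
        }

        int main() {
            std::thread srv(server);
            std::vector<std::thread> clients;
            for (int id = 0; id < kClients; ++id)
                clients.emplace_back([id] {
                    client_request(id, 1);                  // lock
                    std::printf("client %d in critical section\n", id);
                    client_request(id, 2);                  // unlock
                });
            for (auto& c : clients) c.join();
            shutdown_flag.store(true, std::memory_order_release);
            srv.join();
            return 0;
        }

    In a real deployment the server would be pinned to its own core; the payoff reported in the paper is that centralizing the (complicated) contention-sensitive lock/unlock logic this way keeps it cache-hot, cutting its overhead substantially.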

    Using Lock Servers to Scale Real-Time Locking Protocols: Chasing Ever-Increasing Core Counts (Artifact)

    During the past decade, parallelism-related issues have been at the forefront of real-time systems research due to the advent of multicore technologies. In the coming years, such issues will loom ever larger due to increasing core counts. Having more cores means a greater potential exists for platform capacity loss when the available parallelism cannot be fully exploited. In this work, such capacity loss is considered in the context of real-time locking protocols. In this context, lock nesting becomes a key concern, as it can result in transitive blocking chains that unnecessarily force tasks to execute sequentially. Such chains can be quite long on a larger machine. Contention-sensitive real-time locking protocols have been proposed as a means of "breaking" transitive blocking chains, but such protocols tend to have high overhead due to more complicated lock/unlock logic. To ease such overhead, the usage of lock servers is considered herein. In particular, four specific lock-server paradigms are proposed and many nuances concerning their deployment are explored. Experiments are presented that show that, by executing cache hot, lock servers can enable reductions in lock/unlock overhead of up to 86%. Such reductions make contention-sensitive protocols a viable approach in practice. This artifact contains the implementations of two contention-sensitive locking-protocol variants realized with the four proposed lock-server paradigms, as well as the experiments with which they were evaluated.

    Avoiding Pitfalls when Using NVIDIA GPUs for Real-Time Tasks in Autonomous Systems

    NVIDIA's CUDA API has enabled GPUs to be used as computing accelerators across a wide range of applications. This has resulted in performance gains in many application domains, but the underlying GPU hardware and software are subject to many non-obvious pitfalls. This is particularly problematic for safety-critical systems, where worst-case behaviors must be taken into account. While such behaviors were not a key concern for earlier CUDA users, the usage of GPUs in autonomous vehicles has taken CUDA programs out of the sole domain of computer-vision and machine-learning experts and into safety-critical processing pipelines. Certification is necessary in this new domain, which is problematic because GPU software may have been developed without any regard for worst-case behaviors. Pitfalls when using CUDA in real-time autonomous systems can result both from the lack of specifics in official documentation and from developers of GPU software being unaware of the implications of their design choices with regard to real-time requirements. This paper focuses on the particular challenges facing the real-time community when utilizing CUDA-enabled GPUs for autonomous applications, and on best practices for applying real-time safety-critical principles.
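    One concrete pitfall of the kind the paper catalogs is sketched below, under the assumption of a simple busy-wait kernel: kernels launched on the legacy default stream implicitly synchronize with ordinary streams, silently serializing work that a developer intended to run concurrently.

        #include <cstdio>
        #include <cuda_runtime.h>

        __global__ void work() {
            long long start = clock64();
            while (clock64() - start < 50000000LL) { /* simulate tens of ms of work */ }
        }

        int main() {
            cudaStream_t s;
            cudaStreamCreate(&s);                 // an ordinary (blocking) stream

            work<<<1, 32, 0, s>>>();              // intended to run concurrently...
            work<<<1, 32>>>();                    // ...but the legacy default stream
                                                  // implicitly synchronizes with s, so
                                                  // this launch waits for the first
                                                  // kernel to finish.
            cudaDeviceSynchronize();

            // Mitigation: create streams with cudaStreamNonBlocking (or compile with
            // nvcc --default-stream per-thread) to avoid the implicit synchronization.
            cudaStream_t ns;
            cudaStreamCreateWithFlags(&ns, cudaStreamNonBlocking);
            work<<<1, 32, 0, ns>>>();             // no implicit sync with default stream
            work<<<1, 32>>>();
            cudaDeviceSynchronize();

            cudaStreamDestroy(s);
            cudaStreamDestroy(ns);
            printf("done\n");
            return 0;
        }

    The danger in a real-time setting is that such implicit serialization inflates observed response times in ways that are invisible in the source code, which is precisely why the paper argues that worst-case behaviors must be considered when GPU software enters safety-critical pipelines.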